Abstract. This paper examines signal detection in the presence of noise, with 
particular emphasis on nuclear activation analysis. The problem is to decide 
which of the signal-plus-background and no-signal hypotheses fits the data 
better and to quantify the relevant signal amplitude or detection limit. Our solution 
is based on the use of Bayesian inferences to test the different hypotheses. 

1. Introduction 

For signals buried in noise, deciding between the detected and non-detected 
statements is a long-debated problem; in addition, any non-detected decision 
must include a detection-limit statement. For instance, in analytical chemistry, 
the detection limit is defined as the lowest quantity of a substance that can be 
distinguished from no substance at all to within a stated confidence limit [1]. The 
orthodox approach to the estimate of detection limits [2] is based on the concept of 
confidence interval and on its interpretation as outlined in seminal papers by Neyman 
[3, 4]. We investigate an alternative approach that uses Bayesian inferences to test 
the different hypotheses and to quantify the signal amplitude or detection limit. 

Using nuclear activation analysis as an example, that is, the detection of 
the nuclear activity of a radioisotope in a background photon-count, we illustrate 
how Bayesian inferences can be used to choose between the signal-plus-background and 
no-signal hypotheses and to quantify the signal amplitude or detection limit. The 
contaminant amounts are linked to what is observed - the photon numbers in given 
energy bins - by calibration factors. The sampling statistics apply to the counts 
and, therefore, our paper deals only with the observed signal and its associated noise, 
but the conclusions can be easily extended to the concentrations. As regards the 
terminology, the background count is what would be observed in a non-contaminated 
sample, the gross count is what is actually observed, and the net count is what would 
be observed in the absence of background. The term measurand will indicate the mean 
net-count, whereas the terms background- and gross-signal will indicate the expected 
background- and gross-count. 

According to Neyman's view, the detection limits evaluated from the results 
of a large set of repeated measurements must bound a fixed measurand value with 
a given frequency. The detection-limit calculation uses hypothesis testing and the 
distributions of the measurement results given opposite hypotheses. Firstly, the 
sampling distribution of the background is used to establish a critical limit $L_C$ such 
that, if the measurand is zero and the count is only noise (null hypothesis), a net 
count smaller than $L_C$ would be obtained with a high probability, say 95%. Next, 
this statement is reversed by choosing the detection limit $L_D$ in such a way that, if 
the measurand is more than $L_D$ (alternative hypothesis), a net count greater than $L_C$ 
would be obtained with a high probability, say 95%. 

When the measurand value matters, this frequency-of-occurrence view is not 
enough. For instance, decisions require probability assignments to propositions that 
assert the measurand value and, in turn, these require the application of the Bayes 
theorem [3 [71 |H]. In the Bayesian approach, signal detection and signal estimation 
are not independent problems and, in a large set of equal measurement results, 
the detection limit must bound different measurand values with a given frequency. 
Hypothesis testing requires comparing the probabilities that each hypothesis is true, 
given the data; hence, the detected or non-detected choice is made according to the 
maximal probability [9]. Only after such a choice has been made can representative 
values - for example, the mode, mean, or median - and confidence intervals - for 
example, bounding the measurand with a 95% probability - be calculated. 

2. Data model 

Measurements of the impurity concentrations of the $^{28}$Si crystal used for the 
determination of the Avogadro constant [10, 11] are essential to prevent biased results 
or underestimated uncertainties. The existing literature indicates that Si crystals are 
extremely pure, but, to obtain direct evidence of purity, we developed an analytical 
method based on neutron activation [12]. Nuclear activation analysis is based on the 
detection and counting of the $\gamma$ rays emitted by the radioactive isotopes produced 
by the neutron irradiation. When a neutron is captured by a nucleus, a compound 
nucleus is formed in an excited state. This step is followed by a prompt de-excitation 
to a more stable configuration; the new nucleus is usually radioactive and will de-excite 
by emitting delayed $\gamma$ rays or particles. In the latter case the resulting nucleus is 
often still excited and a further $\gamma$ emission can occur. The energy spectrum of the 
emitted $\gamma$ rays shows discrete peaks, which identify and quantify the radioactive nuclei 
and, consequently, the parent contaminants. After calibration against a known amount of 
contaminant, the number of counts stored in the energy bins relevant to a peak gives 
the impurity content of the sample. 

Figure 1. Left: 95% critical limit (lower curve) and 5% detection limit (upper 
curve) calculated according to Currie's construction for a net signal buried 
in the background noise. The shaded area is the 95% quantile of the background 
noise. Right: 5% (lower line) and 95% (upper line) quantiles for $n = n_G - n_B$, 
when $\lambda_B = 20$. The arrow indicates Neyman's 90% confidence interval for $\lambda$ 
when $n = 15$. 

The gross count $n_G$ recorded in a given time interval in any bin of the multichannel 
analyzer includes a background count $n_B$; in addition, owing to the high purity of the 
Si sample, for almost all the elements, the net count, if any, is deeply buried in the 
background. To extract all the available information, we assume that the $n_G$ and $n_B$ 
data are independent random numbers. Hence, had the background and gross signals 
been $\lambda_B$ and $\lambda_G$, their sampling statistics, 

$$P_{G,B}(n_{G,B}|\lambda_{G,B}) = \frac{\lambda_{G,B}^{n_{G,B}}\, e^{-\lambda_{G,B}}}{n_{G,B}!}, \tag{1a}$$

are Poisson distributions having means $\lambda_B$ and $\lambda_G$ and their joint sampling 
distribution is 

$$P_{BG}(n_B,n_G|\lambda_B,\lambda_G) = \frac{\lambda_B^{n_B} \lambda_G^{n_G}\, e^{-(\lambda_B+\lambda_G)}}{n_B!\, n_G!}. \tag{1b}$$

The problem is, firstly, to decide between the detected and non-detected statements 
and, secondly, to quantify the net signal $\lambda = \lambda_G - \lambda_B$ or its detection limit. 
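The sampling model (1a) and (1b) can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical values are assumptions made for the example and are not measurement results. Working with logarithms avoids overflow of the factorials for counts of a few hundred.

```python
import math

def poisson_log_pmf(n, lam):
    """Log of the Poisson probability (1a) of n counts when the signal is lam."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def joint_log_pmf(n_b, n_g, lam_b, lam_g):
    """Log of the joint sampling distribution (1b) of independent counts."""
    return poisson_log_pmf(n_b, lam_b) + poisson_log_pmf(n_g, lam_g)

# Illustrative values only (hypothetical counts and signals).
n_b, n_g = 20, 35
log_p = joint_log_pmf(n_b, n_g, lam_b=20.0, lam_g=35.0)
print(f"log P_BG = {log_p:.3f}")
```

The joint log-probability is simply the sum of the two Poisson terms, which is the independence assumption stated above.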

3. Classical analysis 

Currie's construction of the detection limit is as follows. The distribution of 
the minimum-variance unbiased estimate $n = n_G - n_B$ of $\lambda$ is the Skellam probability 
density [3 US] 

$$\mathrm{Pdf}_{\rm Skl}(n|\lambda_G,\lambda_B) = e^{-(\lambda_G+\lambda_B)} \left(\frac{\lambda_G}{\lambda_B}\right)^{n/2} I_n\!\left(2\sqrt{\lambda_G\lambda_B}\right), \tag{2}$$

where $I_n(x)$ is the modified Bessel function of the first kind and the mean and variance 
of the net count $n$ are $\langle n \rangle = \lambda_G - \lambda_B = \lambda$ and $\sigma^2 = \lambda_G + \lambda_B$, respectively. Hence, 
provided $\lambda_B$ is known - which is a crucial assumption - the critical limit, $L_C = \lceil x \rceil$, is 
the smallest integer greater than or equal to the solution of 

$$\mathrm{Cdf}_{\rm Skl}(x|\lambda_B,\lambda_B) = \alpha, \tag{3}$$

where $\mathrm{Cdf}_{\rm Skl}$ is the cumulative distribution of $\mathrm{Pdf}_{\rm Skl}$ and, for instance, $\alpha = 0.95$. 
Therefore, if the net signal is zero and the gross count is only background, a net count 
greater than $L_C$ would be obtained with a low probability $1 - \alpha$. The net signal 
is assumed detected if $n > L_C$ and non-detected otherwise. The detection limit, 
$L_D = x$, is the solution of 

$$\mathrm{Cdf}_{\rm Skl}(L_C|\lambda_B+x,\lambda_B) = \beta, \tag{4}$$

where, for instance, $\beta = 0.05$. It is worth noting that a prior knowledge of the 
background signal $\lambda_B$ is again assumed. Figure 1 (left) illustrates the procedure; 
if the net signal is more than $L_D$, at least 95% of the net counts are more than 
$L_C$. 
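The two steps of Currie's construction can be sketched numerically as follows. The sketch is an illustration, not the code used for the paper: the function names are hypothetical, the Skellam cumulative distribution is obtained by summing the two Poisson distributions directly instead of evaluating the Bessel function in (2), the critical limit is taken as the smallest integer whose cumulative probability reaches $\alpha$, and the background signal is set equal to the Au background count of table 1. The resulting limits may therefore differ by a count or so from those of other conventions.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of k counts for a mean lam, evaluated in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def skellam_cdf(x, lam_g, lam_b):
    """P(n_G - n_B <= x) for independent Poisson counts with means lam_g and lam_b."""
    top = max(lam_g, lam_b)
    k_max = int(top + 12.0 * math.sqrt(top) + 30.0)   # truncation of the Poisson tails
    cdf_g, total = [], 0.0
    for k in range(k_max + 1):
        total += poisson_pmf(k, lam_g)
        cdf_g.append(total)
    prob = 0.0
    for b in range(k_max + 1):
        g_top = b + x                                  # largest n_G with n_G - b <= x
        if g_top >= 0:
            prob += poisson_pmf(b, lam_b) * cdf_g[min(g_top, k_max)]
    return prob

def critical_limit(lam_b, alpha=0.95):
    """Smallest integer x with Cdf_Skl(x | lam_b, lam_b) >= alpha, cf. (3)."""
    x = 0
    while skellam_cdf(x, lam_b, lam_b) < alpha:
        x += 1
    return x

def detection_limit(lam_b, l_c, beta=0.05, iterations=40):
    """Net signal x solving Cdf_Skl(l_c | lam_b + x, lam_b) = beta, cf. (4), by bisection."""
    lo, hi = 0.0, float(l_c) + 1.0
    while skellam_cdf(l_c, lam_b + hi, lam_b) > beta:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if skellam_cdf(l_c, lam_b + mid, lam_b) > beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam_b = 324.0   # assumption: background signal equal to the Au background count
l_c = critical_limit(lam_b)
l_d = detection_limit(lam_b, l_c)
print(f"L_C = {l_c} counts, L_D = {l_d:.1f} counts")
```

Both functions take the background signal as an input, which makes explicit the assumption that $\lambda_B$ is known.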

We can circumvent the need to know $\lambda_B$ in advance by using Neyman's 
construction [3, 4]; a review can be found in [13]. Actually, this construction produces 
confidence regions for the $(\lambda_B, \lambda)$ pair, but, for the sake of simplicity, we fix the value 
of $\lambda_B$ and calculate a confidence interval for the net signal alone. To this end, following 
Neyman, we introduce a pair of continuous and monotonic functions of $\lambda$, $n_1(\lambda)$ and 
$n_2(\lambda)$, chosen so that $[n_1, n_2]$ is an $\alpha$-interval for $n$. That is, 

$$\mathrm{Cdf}_{\rm Skl}(n_2|\lambda_B+\lambda,\lambda_B) - \mathrm{Cdf}_{\rm Skl}(n_1|\lambda_B+\lambda,\lambda_B) = \alpha. \tag{5}$$

Provided the net count is in the domain of the inverse functions $\lambda_1 = n_2^{-1}(n)$ and 
$\lambda_2 = n_1^{-1}(n)$, 

$$\mathrm{Prob}(\lambda \in [\lambda_1,\lambda_2]\,|\,\lambda) = \alpha, \tag{6}$$

by construction and whatever the measurand value may be. Hence, $[\lambda_1,\lambda_2]$ is the 
sought $\alpha$-interval. Figure 1 (right) illustrates the procedure in the case when 
$\alpha = 0.90$ [13]. According to Neyman's viewpoint, in a long series of repeated 
measurements with fixed gross $\lambda_G$ and background $\lambda_B$ values, 90% of the intervals 
calculated as indicated by the arrow will contain the measurand $\lambda = \lambda_G - \lambda_B$. 

3.1. Conceptual limits 

Currie's constructions of the critical and detection limits rely on prior 
knowledge of the background signal, which is not available. In practice, the 
background count $n_B$ substitutes for $\lambda_B$, but this does not remove the conceptual 
difficulty. 

An alternative is to use the net count $n = n_G - n_B$ to determine Neyman's 
upper limit of the net signal. However, since the sampling distribution of $n$ depends 
also on $\lambda_B$, Neyman's upper limit is still a function of the background signal. 
Additional troubles arise when $n_G < n_B$, because the unbiased and minimum-variance 
estimate of $\lambda$ is negative and unphysical [13]. 

4. Bayesian analysis 

The problems inherent in the classical analysis can be solved by Bayesian inferences. 
They are based on the product rule of probabilities 

$$\Gamma_{BG}(\lambda_B,\lambda_G|n_B,n_G)\, Z_{BG}(n_B,n_G) = P_{BG}(n_B,n_G|\lambda_B,\lambda_G)\, \pi(\lambda_B,\lambda_G), \tag{7}$$

where $\pi(\lambda_B,\lambda_G)$ is the joint probability distribution of the signal values before the 
data are available, $\Gamma_{BG}(\lambda_B,\lambda_G|n_B,n_G)$ is the joint probability distribution of the 
signal values - given the signal-plus-background hypothesis and after the data were 
collected, the likelihood that the signals are $\lambda_B$ and $\lambda_G$ is the sampling distribution 
$P_{BG}(n_B,n_G|\lambda_B,\lambda_G)$ evaluated in $n_B$ and $n_G$, and the evidence of the data model is 
the probability distribution $Z_{BG}(n_B,n_G)$ of the data, no matter what the signals may 
be. 

4.1. Pre-data distribution 

A key step to calculate $\Gamma_{BG}(\lambda_B,\lambda_G|n_B,n_G)$ is to assign the density $\pi(\lambda_B,\lambda_G)$ in the 
$(\lambda_B,\lambda_G)$ points of the signal-value space before the measurement results are available. 
In fact, the only way to assign probabilities to the signal values consistent with the 
measurement results is to update, according to the Bayes theorem, the assignments made 
before the data are at hand. These prior assignments must embed all the information 
available, but, to avoid inferences affected by non-available data, no more than this 
information must be used. 

By using the product rule of the probability algebra we can write 

$$\pi(\lambda_B,\lambda_G) = \pi_G(\lambda_G|\lambda_B)\,\pi_B(\lambda_B), \tag{8}$$

where $\pi_B(x)$ and $\pi_G(x)$ have the same functional form, say, $\pi(x)$, both the signals are 
strictly positive, and $\lambda_G > \lambda_B$. Finally, $\pi$ must be uninformative. Therefore, we 
impose scale invariance [16]. Hence, if $\pi(x) = f(x)$, then $\pi'(kx) = f(kx)$ no matter 
what the $k$ value may be, where $\pi'(kx) = f(x)/k$ is the probability distribution of 
$x' = kx$. This ensures that the functional form of $\pi$ is independent of the duration of 
the counting interval. The reason is that, otherwise, we would embed into $\pi$ - through 
a specific $f$-choice - an information about this duration. Scale invariance limits the $\pi$ 
choice to the Jeffreys' $\pi(x) \propto 1/x$ distribution. 
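The step from scale invariance to the $1/x$ form can be spelled out with a short worked derivation. It uses only the two relations stated above, $\pi'(kx) = f(kx)$ and $\pi'(kx) = f(x)/k$, and is meant as a reading aid rather than as part of the original argument.

```latex
% Scale invariance: the density of x' = kx has the same functional form f.
\begin{align*}
  f(kx) &= \frac{f(x)}{k}
      && \text{for all } k > 0 \text{ and } x > 0, \\
  f(k)  &= \frac{f(1)}{k}
      && \text{(set } x = 1\text{)}, \\
  f(x)  &= \frac{f(1)}{x} \propto \frac{1}{x}
      && \text{(rename } k \text{ as } x\text{)}.
\end{align*}
% Check: f(kx) = f(1)/(kx) = f(x)/k, as required.
```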

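In the $[0, \infty]$ support, this distribution is not normalizable; therefore, we limit its 
support to $0 < \lambda_{\min} < \lambda_B < \lambda_{\max}$ and $\lambda_B < \lambda_G < \lambda_{\max}$ so that 

$$\pi(\lambda_B,\lambda_G) = \frac{\mathrm{If}(\lambda_{\min} < \lambda_B < \lambda_{\max})\,\mathrm{If}(\lambda_B < \lambda_G < \lambda_{\max})}{\lambda_B \lambda_G \ln(\lambda_{\max}/\lambda_{\min}) \ln(\lambda_{\max}/\lambda_B)}, \tag{9}$$

where $\mathrm{If}(\cdot)$ is one if its argument is true and zero otherwise. Since this distribution 
does not allow us to calculate analytically the normalization integrals we will find 
in the following, it will be approximated as 

$$\pi(\lambda_B,\lambda_G) = \frac{2\,\mathrm{If}(\lambda_{\min} < \lambda_B < \lambda_{\max})\,\mathrm{If}(\lambda_B < \lambda_G < \lambda_{\max})}{\lambda_B \lambda_G \ln^2(\lambda_{\max}/\lambda_{\min})}. \tag{10}$$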

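The limits for the support extending from zero to infinity will be discussed where 
appropriate. The pre-data distribution of the net signal is the marginal distribution 

$$\begin{aligned}
\pi_S(\lambda) &= \int_0^\infty\!\!\int_{\lambda_B}^\infty \delta(\lambda_G-\lambda_B-\lambda)\,\pi(\lambda_B,\lambda_G)\,\mathrm{d}\lambda_G\,\mathrm{d}\lambda_B \\
&= \frac{2\left[\ln(\lambda_{\max}-\lambda) + \ln(\lambda_{\min}+\lambda) - \ln(\lambda_{\max}\lambda_{\min})\right]}{\lambda \ln^2(\lambda_{\max}/\lambda_{\min})},
\end{aligned} \tag{11}$$

where $0 < \lambda < \lambda_{\max} - \lambda_{\min}$ and the Dirac delta function $\delta(\lambda_G-\lambda_B-\lambda)$ is the 
distribution of $\lambda$ conditional to the $\lambda_G$ and $\lambda_B$ values [17]. 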
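The normalization factor of the approximated pre-data distribution and the closed form of the net-signal marginal can be checked by direct integration. The worked steps below are a sketch that relies only on the $1/(\lambda_B \lambda_G)$ form and on the bounded support $\lambda_{\min} < \lambda_B < \lambda_G < \lambda_{\max}$.

```latex
% Normalization over \lambda_{\min} < \lambda_B < \lambda_G < \lambda_{\max}:
\begin{align*}
  \int_{\lambda_{\min}}^{\lambda_{\max}} \frac{\mathrm{d}\lambda_B}{\lambda_B}
  \int_{\lambda_B}^{\lambda_{\max}} \frac{\mathrm{d}\lambda_G}{\lambda_G}
  &= \int_{\lambda_{\min}}^{\lambda_{\max}}
     \frac{\ln(\lambda_{\max}/\lambda_B)}{\lambda_B}\,\mathrm{d}\lambda_B
   = \tfrac{1}{2}\ln^2(\lambda_{\max}/\lambda_{\min}),
\end{align*}
% which gives the factor 2/\ln^2(\lambda_{\max}/\lambda_{\min}).
% The delta function sets \lambda_G = \lambda_B + \lambda, so that
% \lambda_B runs from \lambda_{\min} to \lambda_{\max} - \lambda:
\begin{align*}
  \pi_S(\lambda)
  &= \frac{2}{\ln^2(\lambda_{\max}/\lambda_{\min})}
     \int_{\lambda_{\min}}^{\lambda_{\max}-\lambda}
     \frac{\mathrm{d}\lambda_B}{\lambda_B(\lambda_B+\lambda)} \\
  &= \frac{2}{\lambda\ln^2(\lambda_{\max}/\lambda_{\min})}
     \left[\ln\frac{\lambda_B}{\lambda_B+\lambda}
     \right]_{\lambda_{\min}}^{\lambda_{\max}-\lambda} \\
  &= \frac{2\left[\ln(\lambda_{\max}-\lambda)+\ln(\lambda_{\min}+\lambda)
     -\ln(\lambda_{\max}\lambda_{\min})\right]}
     {\lambda\ln^2(\lambda_{\max}/\lambda_{\min})}.
\end{align*}
```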

4.2. Post-data distributions 

By combining (1b), (7), and (10), the joint probability distribution of the background 
and gross signals after the data have been collected is 

$$\Gamma_{BG}(\lambda_B,\lambda_G|n_B,n_G) = \frac{n_B\, \lambda_B^{n_B-1} \lambda_G^{n_G-1}\, e^{-(\lambda_B+\lambda_G)}}{(n_B+n_G-1)!\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1)}, \tag{12}$$

where ${}_2F_1(a,b;c;z)$ is the hypergeometric function, $Z_{BG}$ has been obtained by 
normalization, $[\lambda_{\min},\lambda_{\max}]$ has been chosen large enough that the integration limits 
can be extended from zero to infinity, $n_B > 0$, $n_G > 0$, and $\lambda_G > \lambda_B > 0$. The 
post-data distribution of the net signal is the marginal distribution, 

$$\begin{aligned}
\Gamma_S(\lambda|n_B,n_G) &= \int_0^\infty\!\!\int_{\lambda_B}^\infty \delta(\lambda_G-\lambda_B-\lambda)\,\Gamma_{BG}(\lambda_B,\lambda_G|n_B,n_G)\,\mathrm{d}\lambda_G\,\mathrm{d}\lambda_B \\
&= \int_0^\infty \frac{n_B\, \lambda_B^{n_B-1} (\lambda_B+\lambda)^{n_G-1}\, e^{-(2\lambda_B+\lambda)}}{(n_B+n_G-1)!\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1)}\,\mathrm{d}\lambda_B \\
&= \frac{n_B!\, e^{-\lambda} \lambda^{n_B+n_G-1}\, U(n_B,n_B+n_G,2\lambda)}{(n_B+n_G-1)!\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1)},
\end{aligned} \tag{13}$$

where $n_B > 0$, $n_G > 0$, $\lambda > 0$, and $U(a,b,z)$ is the confluent hypergeometric function. 
Representative values - for example, the mode, mean, or median - and confidence 
intervals can be calculated from (13), but, contrary to a Neyman's interval, a Bayesian 
interval is such that, in a long series of repeated measurements of different net signals 
giving the same net count $n$, a given fraction of the net signals lies in it. 
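Representative values of the net signal can also be estimated by sampling, without evaluating the hypergeometric functions. The sketch below is a Monte Carlo substitute for the closed form, not the method used in the paper: it relies on the observation that (12) is the product of two gamma densities, with shape parameters $n_B$ and $n_G$, restricted to $\lambda_G > \lambda_B$. The function names, sample size, and seed are assumptions; the counts are those of table 1.

```python
import random

def net_signal_samples(n_b, n_g, size=20_000, seed=1):
    """Draw net signals lambda_G - lambda_B from the joint post-data distribution (12)."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < size:
        lam_b = rng.gammavariate(n_b, 1.0)   # density ~ lam_b^(n_b-1) exp(-lam_b)
        lam_g = rng.gammavariate(n_g, 1.0)   # density ~ lam_g^(n_g-1) exp(-lam_g)
        if lam_g > lam_b:                    # prior information lambda_G > lambda_B
            samples.append(lam_g - lam_b)
    samples.sort()
    return samples

def quantile(sorted_samples, q):
    """Empirical q-quantile taken from the sorted samples."""
    index = min(int(q * len(sorted_samples)), len(sorted_samples) - 1)
    return sorted_samples[index]

# (n_B, n_G) counts of table 1
counts = {"Au": (324, 500), "La": (306, 284), "As": (296, 311)}
summary = {}
for element, (n_b, n_g) in counts.items():
    s = net_signal_samples(n_b, n_g)
    summary[element] = (quantile(s, 0.5), quantile(s, 0.95))
    print(f"{element}: median = {summary[element][0]:.0f}, "
          f"95% upper limit = {summary[element][1]:.0f}")
```

Rejection sampling keeps the code short, at the cost of discarding most draws when the gross count is smaller than the background count.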



4.3. Model selection 

The no-signal hypothesis means that the joint sampling distribution of the data is 

$$P_{BB}(n_G,n_B|\lambda_B) = \frac{\lambda_B^{n_B+n_G}\, e^{-2\lambda_B}}{n_B!\, n_G!}. \tag{14}$$

Consequently, given the no-signal hypothesis, the post-data probability distribution 
of the background signal is 

$$\Gamma_{BB}(\lambda_B|n_G,n_B) = \frac{2^{n_B+n_G}\, \lambda_B^{n_B+n_G-1}\, e^{-2\lambda_B}}{(n_B+n_G-1)!}, \tag{15}$$

where $Z_{BB}$ has been obtained by normalization, $[\lambda_{\min},\lambda_{\max}]$ has been chosen large 
enough that the integration limits can be extended from zero to infinity, $\lambda_B > 0$, 
and $n_B + n_G > 0$. 

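To choose between the signal-plus-background and no-signal hypotheses, $H_{BG}$ and 
$H_{BB}$; that is, between the joint sampling distributions (1b) and (14), we need the 
probability that each hypothesis is true given $n_B$ and $n_G$ [9]. On the assumption that, 
before the data are available, the probabilities of the two hypotheses are the same, the 
post-data probabilities of $H_{BG}$ and $H_{BB}$ are proportional through the same factor to 
the evidences 

$$\begin{aligned}
Z_{BG} &= \int_{\lambda_{\min}}^{\lambda_{\max}}\!\!\int_{\lambda_B}^{\lambda_{\max}} \frac{2\, \lambda_B^{n_B-1} \lambda_G^{n_G-1}\, e^{-(\lambda_B+\lambda_G)}}{n_B!\, n_G!\, \ln^2(\lambda_{\max}/\lambda_{\min})}\,\mathrm{d}\lambda_G\,\mathrm{d}\lambda_B \\
&= \frac{2\,(n_B+n_G-1)!\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1)}{n_B\, n_B!\, n_G!\, \ln^2(\lambda_{\max}/\lambda_{\min})}
\end{aligned} \tag{16a}$$

and 

$$Z_{BB} = \int_{\lambda_{\min}}^{\lambda_{\max}} \frac{\lambda_B^{n_B+n_G-1}\, e^{-2\lambda_B}}{n_B!\, n_G!\, \ln(\lambda_{\max}/\lambda_{\min})}\,\mathrm{d}\lambda_B = \frac{(n_B+n_G-1)!}{2^{n_B+n_G}\, n_B!\, n_G!\, \ln(\lambda_{\max}/\lambda_{\min})}, \tag{16b}$$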

where $[\lambda_{\min},\lambda_{\max}]$ has been chosen large enough that the integration limits can 
be extended from zero to infinity and $n_B > 0$, $n_G > 0$. In (16a) and (16b), 
$\ln^2(\lambda_{\max}/\lambda_{\min})$ and $\ln(\lambda_{\max}/\lambda_{\min})$ are Ockham's penalties for the size of the signal 
space 0161 mil]. Hence, 

$$\mathrm{Prob}(H_{BG}|n_B,n_G) = \frac{Z_{BG}}{Z_{BG}+Z_{BB}} = \frac{2^{n_B+n_G+1}\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1)}{2^{n_B+n_G+1}\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1) + n_B \ln(\lambda_{\max}/\lambda_{\min})} \tag{17a}$$

and 

$$\mathrm{Prob}(H_{BB}|n_B,n_G) = \frac{Z_{BB}}{Z_{BG}+Z_{BB}} = \frac{n_B \ln(\lambda_{\max}/\lambda_{\min})}{2^{n_B+n_G+1}\; {}_2F_1(n_B,n_B+n_G;n_B+1;-1) + n_B \ln(\lambda_{\max}/\lambda_{\min})}. \tag{17b}$$

Table 1. Background and gross counts for the measurements of the amounts of 
Au, La, and As in a sample of the natural silicon crystal WASO04 by neutron 
activation analysis. The counting interval was 2 h. The 95% critical and detection 
limits have been calculated according to Currie's constructions with the 
assumption $\lambda_B = n_B$. 

element  reaction                             energy   n_B     n_G     n       L_C     L_D
                                              keV      counts  counts  counts  counts  counts
Au       $^{197}$Au(n,$\gamma$)$^{198}$Au     411.67   324     500     176     42      88
La       $^{139}$La(n,$\gamma$)$^{140}$La     487.02   306     284     -22     41      85
As       $^{75}$As(n,$\gamma$)$^{76}$As       559.10   296     311     15      40      84

The support of the pre-data distribution must be bounded to a non-null lower 
limit and a finite upper limit. Otherwise, $\mathrm{Prob}(H_{BB}|n_B,n_G)$ tends to one and 
$\mathrm{Prob}(H_{BG}|n_B,n_G)$ tends to zero. This paradoxical result is caused by the larger 
parameter space of the $H_{BG}$ hypothesis and, consequently, its larger Ockham's 
penalty. This could appear a limitation; however, a $[\lambda_{\min},\lambda_{\max}]$ choice can be made on 
the basis of the background information. In addition, from a numerical viewpoint, the 
logarithm function maps huge $[\lambda_{\min},\lambda_{\max}]$ intervals into negligible Ockham's factors. 
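The evidences and the hypothesis probabilities can be evaluated numerically without a special-function library. The sketch below is illustrative and makes two assumptions that should be read as such: the hypergeometric function is computed through Euler's integral representation, which after the substitution $t = u/(1-u)$ reads ${}_2F_1(n_B,n_B+n_G;n_B+1;-1) = n_B \int_0^{1/2} u^{n_B-1}(1-u)^{n_G-1}\,\mathrm{d}u$, and the width of the prior support is taken as $\ln(\lambda_{\max}/\lambda_{\min}) = \ln 10^8$. The function names are hypothetical; the counts are those of table 1.

```python
import math

def log_half_beta_integral(n_b, n_g, steps=20_000):
    """Log of the integral of u^(n_b-1) (1-u)^(n_g-1) over 0 < u < 1/2 (midpoint rule)."""
    h = 0.5 / steps
    logs = []
    for i in range(steps):
        u = (i + 0.5) * h
        logs.append((n_b - 1) * math.log(u) + (n_g - 1) * math.log1p(-u))
    peak = max(logs)                       # factor out the maximum to avoid underflow
    return peak + math.log(h * sum(math.exp(v - peak) for v in logs))

def evidences(n_b, n_g, log_range):
    """Return (Z_BG, Z_BB) of (16a) and (16b); log_range = ln(lambda_max / lambda_min)."""
    n = n_b + n_g
    log_factorials = math.lgamma(n_b + 1) + math.lgamma(n_g + 1)
    log_z_bb = math.lgamma(n) - n * math.log(2.0) - log_factorials - math.log(log_range)
    # Euler's integral: 2F1(n_b, n; n_b + 1; -1) = n_b * integral over 0 < u < 1/2
    log_2f1 = math.log(n_b) + log_half_beta_integral(n_b, n_g)
    log_z_bg = (math.log(2.0) + math.lgamma(n) + log_2f1 - math.log(n_b)
                - log_factorials - 2.0 * math.log(log_range))
    return math.exp(log_z_bg), math.exp(log_z_bb)

log_range = math.log(1.0e8)   # assumed width of the prior support
counts = {"Au": (324, 500), "La": (306, 284), "As": (296, 311)}  # (n_B, n_G)
prob_detected = {}
for element, (n_b, n_g) in counts.items():
    z_bg, z_bb = evidences(n_b, n_g, log_range)
    prob_detected[element] = z_bg / (z_bg + z_bb)
    print(f"{element}: Z_BG = {z_bg:.2e}, Z_BB = {z_bb:.2e}, "
          f"Prob(H_BG) = {prob_detected[element]:.3f}")
```

Changing `log_range` shows directly how the Ockham's penalties shift the balance between the two hypotheses.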



5. Application example 

As an application example, we consider the measurements of the amounts of Au, La, 
and As in a sample of the natural silicon crystal WASO04 by neutron activation 
analysis [12]. Zooms of the emission spectra in the neighbourhood of the channels 
corresponding to the energies of the $\gamma$ rays emitted in the de-excitation of the activated 
nuclei are shown in Fig. 2. All the photons collected in the bins included in each peak 
(chosen as five times the calibrated full peak half width) were added to obtain the gross 
counts. The background counts were estimated by adding all the photons collected in 
an equal number of tail channels evenly divided between the left and right tails. 
The relevant reactions, peak energies, and background, gross, and net counts are given 
in table 1. The 95% critical and detection limits have been calculated according to 
Currie's constructions (3) and (4); they are shown in table 1. Their meanings are 
as follows: if the net signal is zero, the probability that the net count is less than $L_C$ 
is 0.95; if the net signal is more than $L_D$, the probability that the net count is more 
than $L_C$ is 0.95. Accordingly, only a gold contamination has been detected. 

Figure 2. Zooms of the emission spectra in the neighbourhood of the channels 
(indicated by the arrows) corresponding to the energies of the $\gamma$ rays emitted in the 
de-excitation of the activated Au, La, and As nuclei. The shaded areas indicate the 
peak widths. The horizontal lines indicate the background counts, as estimated 
from the peak tails; the line lengths indicate the tail channels considered. 



Figure 3. Left: sampling distributions of the unbiased minimum-variance 
estimates of the net signal for Au, La, and As. Right: post-data distributions 
of the net signals. The filled curve is the pre-data distribution (11). 

Figure 4. Contour plots of the joint post-data distributions of the background 
and gross signals for Au, La, and As. The white areas are excluded by the prior 
information $\lambda_G > \lambda_B$. 

The unbiased minimum-variance estimates of the gold, lanthanum, and arsenic 
net-signals are $n(\mathrm{Au}) = n_G(\mathrm{Au}) - n_B(\mathrm{Au})$, $n(\mathrm{La}) = n_G(\mathrm{La}) - n_B(\mathrm{La})$, and 
$n(\mathrm{As}) = n_G(\mathrm{As}) - n_B(\mathrm{As})$; they are given in table 2 together with the relevant 
standard deviations. The standard deviations have been calculated by using the Skellam 
distribution (2), where the estimates $n_B$ and $n_G$ of the background and gross signals 
have been used. The hypothetical Skellam sampling-distributions of the net counts 
are shown in Fig. 3 (left). To calculate the actual sampling distributions would require 
knowing the background- and gross-signal values in advance; in Fig. 3 they were 
set equal to the background and gross counts with the exception of the $n(\mathrm{La})$ 
distribution, where both were set equal to the $[n_G(\mathrm{La}) + n_B(\mathrm{La})]/2$ mean. It is worth 
noting that $n(\mathrm{La})$, though a perfectly legitimate unbiased estimate of $\lambda(\mathrm{La})$, is negative 
and non-physical. Table 2 gives also the 95% Neyman upper-limits of the net signals, 
which have been calculated for $\lambda_B = n_B$. Their meaning is as follows: in a large set 
of measurement repetitions, 95% of the upper limits so calculated are more than the net 
signal. In this frequency-of-occurrence sense, the probability that the net signal is less 
than the Neyman upper-limits is 0.95. 
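The frequency-of-occurrence quantities discussed above can be sketched as follows. The code is illustrative and rests on stated assumptions: the function names are hypothetical, the Skellam cumulative distribution is computed by direct summation of the Poisson distributions, the standard deviation is taken as $\sqrt{n_G + n_B}$, and the upper limit is defined as the net signal for which a net count not exceeding the observed one occurs with probability 0.05, with $\lambda_B = n_B$.

```python
import math

def skellam_cdf(x, lam_g, lam_b):
    """P(n_G - n_B <= x) for independent Poisson counts with means lam_g and lam_b."""
    top = max(lam_g, lam_b)
    k_max = int(top + 12.0 * math.sqrt(top) + 30.0)   # truncation of the Poisson tails

    def pmf(k, lam):
        return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

    cdf_g, total = [], 0.0
    for k in range(k_max + 1):
        total += pmf(k, lam_g)
        cdf_g.append(total)
    prob = 0.0
    for b in range(k_max + 1):
        g_top = b + x                                  # largest n_G with n_G - b <= x
        if g_top >= 0:
            prob += pmf(b, lam_b) * cdf_g[min(g_top, k_max)]
    return prob

def neyman_upper_limit(n, lam_b, cl=0.95, iterations=40):
    """Net signal for which a net count <= n has probability 1 - cl (bisection)."""
    def excess(lam):
        return skellam_cdf(n, lam_b + lam, lam_b) - (1.0 - cl)
    if excess(0.0) <= 0.0:      # even a null signal makes such a low count improbable
        return 0.0
    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

counts = {"Au": (324, 500), "La": (306, 284), "As": (296, 311)}  # (n_B, n_G)
limits = {}
for element, (n_b, n_g) in counts.items():
    n = n_g - n_b
    sigma = math.sqrt(n_g + n_b)          # Skellam standard deviation with signals = counts
    limits[element] = neyman_upper_limit(n, float(n_b))
    print(f"{element}: n = {n}({sigma:.0f}), upper limit = {limits[element]:.0f}")
```

The background signal enters both arguments of the cumulative distribution, which is why the upper limit remains a function of $\lambda_B$.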

The Bayesian joint post-data distributions of the background and gross signals are 
shown in Fig. 4, where the support of the pre-data distribution is from $\lambda_{\min} = 10^{-4}$ 
to $\lambda_{\max} = 10^{4}$. The relevant marginal distributions of the net signals are given in Fig. 
3 (right). For comparison, Fig. 3 shows also the pre-data distribution (11). 

Table 2. Bayesian inferences: median net-signals and 95% upper limits for 
the Au, La, and As fractions in the WASO04 sample; the subscripts and 
superscripts indicate the first and third quartiles. Frequency-of-occurrence viewpoint: 
unbiased estimates (in parentheses are the standard deviations) and 95% Neyman 
upper limits of the net signals. The Neyman upper limits have been calculated 
by assuming that $\lambda_B = n_B$. 

           Bayesian inferences          frequency-of-occurrence
element    median     95% interval      n          95% interval
           counts     counts            counts     counts
           24t{t
Au                    < 223             176(29)    < 225
La                    < 35              -22(24)    < 20
As                    < 59              15(25)     < 57


The probabilities of both the detected and non-detected statements have been 
calculated according to the evidence of the relevant data models; the results are 
given in table 3. The gold contamination is evident; the lanthanum and arsenic 
contaminations are very uncertain. Table 2 gives the median of the possible net-signal 
values together with the 25% and 75% quantiles. This table gives also the 95% 
Bayesian upper-limits of the net signals, whose meaning is as follows: in a large set of 
measurement repetitions giving the same background and gross counts, 95% of the net 
signals (in principle, different) are less than the limit so calculated. It must be noted 
that the Bayesian median of the possible $\lambda(\mathrm{La})$ values is positive; further discussions 
of the Bayesian inference of a positive quantity from a negative measurement result 
can be found in [18, 19]. 

Despite their different conceptual meanings - the median of the net-signal 
value-space and a net-signal measure drawn from an unbiased minimum-variance 
population of net-signal estimates - the Bayesian estimate and the frequency-of-occurrence 
measure of $\lambda(\mathrm{Au})$ are numerically the same. The same is true for the relevant Bayesian 
and Neyman confidence intervals, though the first refers to an ensemble of different 
net-signal values but the same background and gross counts and the second refers 
to an ensemble of different intervals calculated from different background and gross 
counts but the same net-signal value. The reason is that both the approaches rely on 
similar, quasi-Gaussian, probability distributions and that the prior information was 
irrelevant. On the contrary, significant differences are evident when the net count 
approaches zero or is negative. 



Table 3. Evidences and probabilities of the detected and non-detected 
hypotheses. The support of the prior distribution of the background and gross 
signals is from $\lambda_{\min} = 10^{-4}$ to $\lambda_{\max} = 10^{4}$. 

               Au                                  La                                  As
hypothesis     evidence                probability evidence                probability evidence                probability
detected       $3.6 \times 10^{-8}$    100%        $1.2 \times 10^{-8}$    1%          $4.7 \times 10^{-8}$    2%
non-detected   $1.1 \times 10^{-14}$   0%          $2.0 \times 10^{-6}$    99%         $2.4 \times 10^{-6}$    98%

6. Conclusions 

We showed that probability calculus and Bayesian inferences offer a solution to the 
problem of deciding between the signal-plus-background and no-signal hypotheses, 
when looking for quantities whose magnitude is comparable with the background noise 
of the measurement procedure. Given the measurement results, once the probabilities 
of the detected and non-detected hypotheses have been calculated, optimal decisions 
follow. For instance, once the signal-plus-background model has been selected, a 
measurand value can be optimally chosen according to the post-data probabilities of its 
possible values. As regards the detection-limit estimate, the Neyman approach focuses 
attention on the data processing and is concerned with finding a statistic capable of 
a pre-determined performance in the set of the results of repeated measurements 
of the same measurand. The Bayesian approach - which focuses attention on the 
measurand-value probabilities - is concerned with the set of different measurand values 
consistent with repeated measurements giving the same result. 